OTMatch: Improving Semi-Supervised Learning with Optimal Transport
Semi-supervised learning has made remarkable strides by effectively utilizing
a limited amount of labeled data while capitalizing on the abundant information
present in unlabeled data. However, current algorithms often prioritize
aligning image predictions with specific classes generated through
self-training techniques, thereby neglecting the inherent relationships that
exist within these classes. In this paper, we present a new approach called
OTMatch, which leverages semantic relationships among classes by employing an
optimal transport loss function. By utilizing optimal transport, our proposed
method consistently outperforms established state-of-the-art methods. Notably,
OTMatch reduces the error rate relative to the current state-of-the-art method,
FreeMatch, by 3.18%, 3.46%, and 1.28% on CIFAR-10 with 1 label per class,
STL-10 with 4 labels per class, and ImageNet with 100 labels per class,
respectively. This demonstrates the effectiveness of our approach in harnessing
semantic relationships to enhance learning performance in a semi-supervised
setting.
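To make the idea of a class-aware transport loss concrete, here is a minimal sketch, not the OTMatch implementation. It relies on one fact: when the target is a one-hot pseudo-label, the only feasible transport plan moves all predicted mass to that class, so the optimal transport cost has a closed form. The cost matrix `COST`, the three example classes, and the function names are hypothetical; the abstract does not say how class relationships are encoded.

```python
import math

def cross_entropy(probs, target):
    """Standard cross-entropy against a one-hot pseudo-label."""
    return -math.log(probs[target])

def ot_cost_to_pseudo_label(probs, target, cost):
    """Optimal transport cost from `probs` to a one-hot target.

    With a one-hot target the transport plan is unique: every class i
    sends its mass probs[i] to `target`, paying cost[i][target] per unit.
    """
    return sum(p * cost[i][target] for i, p in enumerate(probs))

# Hypothetical ground costs for classes (cat, dog, truck):
# cat and dog are semantically close, truck is far from both.
COST = [
    [0.0, 0.2, 1.0],
    [0.2, 0.0, 1.0],
    [1.0, 1.0, 0.0],
]

pred_near = [0.6, 0.4, 0.0]  # leftover mass on "dog"
pred_far = [0.6, 0.0, 0.4]   # leftover mass on "truck"

ce_near = cross_entropy(pred_near, 0)
ce_far = cross_entropy(pred_far, 0)
ot_near = ot_cost_to_pseudo_label(pred_near, 0, COST)
ot_far = ot_cost_to_pseudo_label(pred_far, 0, COST)
```

Under these assumed costs, cross-entropy scores both predictions identically, while the transport cost penalizes confusing a cat with a truck more than confusing it with a dog.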
DiffKendall: A Novel Approach for Few-Shot Learning with Differentiable Kendall's Rank Correlation
Few-shot learning aims to adapt models trained on the base dataset to novel
tasks whose categories the model has not seen before. This often leads
to a relatively uniform distribution of feature values across channels on novel
classes, posing challenges in determining channel importance for novel tasks.
Standard few-shot learning methods employ geometric similarity metrics such as
cosine similarity and negative Euclidean distance to gauge the semantic
relatedness between two features. However, features with high geometric
similarities may carry distinct semantics, especially in the context of
few-shot learning. In this paper, we demonstrate that the importance ranking of
feature channels is a more reliable indicator for few-shot learning than
geometric similarity metrics. We observe that replacing the geometric
similarity metric with Kendall's rank correlation only during inference is able
to improve the performance of few-shot learning across a wide range of datasets
with different domains. Furthermore, we propose a carefully designed
differentiable loss for meta-training to address the non-differentiability
issue of Kendall's rank correlation. Extensive experiments demonstrate that the
proposed rank-correlation-based approach substantially enhances few-shot
learning performance.
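As an illustration of ranking-based similarity, the sketch below computes Kendall's rank correlation between two feature vectors, together with a smooth surrogate. The surrogate replaces the sign function with `tanh`, which is one common relaxation and an assumption here; it is not necessarily the differentiable loss proposed in the paper. The function names and the `alpha` sharpness parameter are hypothetical.

```python
import math
from itertools import combinations

def _sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant pairs) / total pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length >= 2")
    total = sum(
        _sign(x[i] - x[j]) * _sign(y[i] - y[j])
        for i, j in combinations(range(n), 2)
    )
    return total / (n * (n - 1) / 2)

def soft_kendall_tau(x, y, alpha=10.0):
    """Smooth surrogate: tanh(alpha * diff) stands in for sign(diff).

    Assumption: this relaxation is for illustration only. Larger alpha
    tracks the exact tau more closely but gives steeper gradients.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length >= 2")
    total = sum(
        math.tanh(alpha * (x[i] - x[j])) * math.tanh(alpha * (y[i] - y[j]))
        for i, j in combinations(range(n), 2)
    )
    return total / (n * (n - 1) / 2)

# Two channel-activation vectors whose scales differ but whose channel
# rankings mostly agree (five of six pairs are concordant).
feat_a = [0.1, 0.5, 0.9, 0.3]
feat_b = [1.0, 3.0, 2.0, 1.5]
tau = kendall_tau(feat_a, feat_b)
soft_tau = soft_kendall_tau(feat_a, feat_b, alpha=50.0)
```

Because only the ordering of channels matters, rescaling either vector by a positive constant leaves `kendall_tau` unchanged.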
Beyond Keywords and Relevance: A Personalized Ad Retrieval Framework in E-Commerce Sponsored Search
On most sponsored search platforms, advertisers bid on keywords for
their advertisements (ads). Given a search request, the ad retrieval module
rewrites the query into bidding keywords and uses these keywords as keys to
select the top N ads through inverted indexes. As a result, an ad is not
retrieved, even for a related query, when the advertiser has not bid on the
corresponding keywords. Moreover, most ad retrieval approaches treat rewriting
and ad selection as two separate tasks and focus on boosting relevance
between search queries and ads. Recently, e-commerce sponsored search has
introduced more and more personalized information, such as user profiles and
long-term and real-time clicks. Personalized information lets ad retrieval
employ more elements (e.g., real-time clicks) as search signals and
retrieval keys, but it also makes it harder to measure ads retrieved
through different signals. To address these problems, we propose a
novel ad retrieval framework beyond keywords and relevance in e-commerce
sponsored search. First, we use historical ad click data to initialize a
hierarchical network representing signals, keys, and ads, into which
personalized information is introduced. Then we train a model on top of the
hierarchical network by learning the weights of its edges. Finally, we select
the best edges according to the model, boosting RPM/CTR. Experimental results
on our e-commerce platform demonstrate that our ad retrieval framework
achieves good performance.
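The keyword-based baseline that the abstract criticizes can be sketched as follows. This is a toy illustration, not the platform's system: the rewrite table, the ad identifiers, and ranking by bid price are all assumptions made for the example.

```python
import heapq
from collections import defaultdict

def build_inverted_index(bids):
    """Map each bidding keyword to the (price, ad_id) pairs that bid on it."""
    index = defaultdict(list)
    for ad_id, keyword, price in bids:
        index[keyword].append((price, ad_id))
    return index

def retrieve_top_n(query, rewrites, index, n):
    """Rewrite the query into bidding keywords, then select the top N ads.

    Assumption: ads are ranked by their highest matching bid price.
    """
    keywords = rewrites.get(query, [query])
    best = {}
    for keyword in keywords:
        for price, ad_id in index.get(keyword, []):
            best[ad_id] = max(price, best.get(ad_id, 0.0))
    top = heapq.nlargest(n, best.items(), key=lambda item: item[1])
    return [ad_id for ad_id, _ in top]

# Hypothetical bids: (ad_id, keyword, bid_price).
bids = [
    ("ad1", "running shoes", 1.2),
    ("ad2", "sneakers", 0.9),
    ("ad3", "trail shoes", 2.0),
]
rewrites = {"jogging shoes": ["running shoes", "sneakers"]}

index = build_inverted_index(bids)
ads = retrieve_top_n("jogging shoes", rewrites, index, n=2)
```

In this toy data, `ad3` is plausibly related to the query but is never retrieved because its advertiser did not bid on any rewritten keyword, which is the limitation the framework sets out to address.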
DREAM+: Efficient Dataset Distillation by Bidirectional Representative Matching
Dataset distillation plays a crucial role in creating compact datasets with
similar training performance compared with original large-scale ones. This is
essential for addressing the challenges of data storage and training costs.
Prevalent methods facilitate knowledge transfer by matching the gradients,
embedding distributions, or training trajectories of synthetic images with
those of the sampled original images. Although there are various matching
objectives, currently the strategy for selecting original images is limited to
naive random sampling. We argue that random sampling overlooks the evenness of
the selected sample distribution, which may result in noisy or biased matching
targets. Besides, the sample diversity is also not constrained by random
sampling. Additionally, current methods predominantly focus on
single-dimensional matching, where information is not fully utilized. To
address these challenges, we propose a novel matching strategy called Dataset
Distillation by Bidirectional REpresentAtive Matching (DREAM+), which selects
representative original images for bidirectional matching. DREAM+ is applicable
to a variety of mainstream dataset distillation frameworks and significantly
reduces the number of distillation iterations by more than 15 times without
affecting performance. Given sufficient training time, DREAM+ can further
improve the performance and achieve state-of-the-art results. We have released
the code at github.com/NUS-HPC-AI-Lab/DREAM+.
Comment: This is an extension of the ICCV conference version
Examining parental educational expectations in one of the oldest children's savings account programs in the country: The Harold Alfond College Challenge
Peer Reviewed
https://deepblue.lib.umich.edu/bitstream/2027.42/153305/1/2020ChenZhangetalCYSR.pd
Research on 3D Path Planning of Quadrotor Based on Improved A* Algorithm
Given the complexity of three-dimensional environments and the flexibility of quadrotor aircraft, using the traditional A* algorithm for global path planning has several disadvantages: few search directions, many expanded nodes, and long planned paths. Therefore, an improved A* algorithm is proposed, with improvements in two respects. First, a two-layer extended neighborhood strategy is proposed, which increases the number of search directions and makes better use of the aircraft's flexibility. Second, the heuristic function is improved so that its value is closer to the actual planned path distance, which reduces the number of expanded nodes and optimizes the planned path. Finally, path planning simulations of the improved A* algorithm show that the planned path is shorter and fewer nodes are expanded, so the algorithm can better guide the quadrotor to its destination.
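For reference, a conventional A* search on a 3D occupancy grid might look like the sketch below. It is a baseline with a one-layer, 26-connected neighborhood and a Euclidean heuristic, not the improved algorithm from the paper; the two-layer neighborhood and the modified heuristic are not reproduced because the abstract does not specify them. The grid representation and the function name `astar_3d` are assumptions.

```python
import heapq
import math
from itertools import product

# One-layer neighborhood: the 26 cells surrounding a grid cell.
MOVES = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]

def astar_3d(start, goal, blocked, size):
    """Return (path, expanded_count); path is None if the goal is unreachable.

    `blocked` is a set of occupied cells and `size` the grid dimensions.
    Note: diagonal moves are allowed to pass between obstacle corners.
    """
    def heuristic(cell):
        return math.dist(cell, goal)  # straight-line distance to the goal

    open_heap = [(heuristic(start), 0.0, start)]
    g_cost = {start: 0.0}
    parent = {start: None}
    closed = set()
    while open_heap:
        _, g_cur, cur = heapq.heappop(open_heap)
        if cur in closed:
            continue  # stale heap entry
        closed.add(cur)
        if cur == goal:
            path = []
            while cur is not None:
                path.append(cur)
                cur = parent[cur]
            return path[::-1], len(closed)
        for move in MOVES:
            nxt = tuple(c + d for c, d in zip(cur, move))
            if any(not 0 <= v < n for v, n in zip(nxt, size)):
                continue  # outside the grid
            if nxt in blocked:
                continue
            cand = g_cur + math.dist(cur, nxt)
            if cand < g_cost.get(nxt, math.inf):
                g_cost[nxt] = cand
                parent[nxt] = cur
                heapq.heappush(open_heap, (cand + heuristic(nxt), cand, nxt))
    return None, len(closed)

# A wall at x == 2 with a single gap at (2, 4, 4).
wall = {(2, y, z) for y in range(5) for z in range(5)} - {(2, 4, 4)}
path, expanded = astar_3d((0, 0, 0), (4, 0, 0), wall, (5, 5, 5))
```

The returned `expanded` count is the quantity the abstract reports reducing, so the same harness could in principle be used to compare neighborhood and heuristic variants.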
Identification of a 24-gene panel and a novel marker of PODXL2 essential for the pathological diagnosis of early prostate cancer
Precise diagnosis of early prostate cancer (PCa) is critical for preventing tumor progression. However, the diagnostic outcomes of currently used markers are far from satisfactory because of their low sensitivity or specificity. Here, we identified a diagnostic subpopulation in PCa tissue through integrated analysis of single-cell and bulk RNA-seq data. The representative markers of this subpopulation were extracted and intersected with an early-PCa-related gene module generated by weighted correlation network analysis (WGCNA). A total of 24 overlapping genes were obtained, and their diagnostic roles were validated by distinguishing normal from tumorous prostate samples in a public dataset. A least absolute shrinkage and selection operator (LASSO) model was constructed based on these genes, and the resulting 24-gene panel showed high sensitivity and specificity for PCa diagnosis, identifying PCa better than the commercially used Oncotype DX gene panel. The top two risk factors, TRPM4 and PODXL2, were verified by multiplex immunostaining to be highly expressed in early PCa tissues, and PODXL2 was more sensitive and specific than TRPM4 and the pathologically used marker AMACR for early PCa diagnosis, suggesting a novel and promising pathology marker.
Reversible dehydrogenation and rehydrogenation of cyclohexane and methylcyclohexane by single-site platinum catalyst.
Developing highly efficient and reversible hydrogenation-dehydrogenation catalysts shows great promise for hydrogen storage technologies with highly desirable economic and ecological benefits. Herein, we show that reaction sites consisting of single Pt atoms and neighboring oxygen vacancies (VO) can be prepared on CeO2 (Pt1/CeO2) with unique catalytic properties for the reversible dehydrogenation and rehydrogenation of large molecules such as cyclohexane and methylcyclohexane. Specifically, we find that the dehydrogenation rate of cyclohexane and methylcyclohexane on such sites can reach values above 32,000 mol H2 per mol Pt per hour, which is 309 times higher than that of conventional supported Pt nanoparticles. Combining DRIFTS, AP-XPS, EXAFS, and DFT calculations, we show that the Pt1/CeO2 catalyst exhibits a super-synergistic effect between the catalytic Pt atom and its support, involving redox coupling between Pt and Ce ions, which enables adsorption, activation, and reaction of large molecules with sufficient versatility to drive abstraction and addition of hydrogen without requiring multiple reaction sites.